Add OrtExternalResourceImporter API for D3D12 shared resource import #26828
Introduces the OrtExternalResourceImporter API, enabling execution providers to import D3D12 shared resources and timeline fences for zero-copy GPU-to-GPU data sharing with ORT inference.
Public API additions:
- OrtExternalResourceImporter capability object
- OrtExternalMemoryHandle for imported D3D12 allocations
- OrtExternalSemaphoreHandle for imported D3D12 timeline fences
- SessionGetEpDeviceForOutputs to query output EP device placement
- RunOptions_SetSyncStream to associate a sync stream for async execution
EP Plugin API:
- OrtExternalResourceImporterImpl interface for EP implementations
- OrtEpFactory::CreateExternalResourceImporterForDevice extension
Design:
- No GPU virtual addresses in the public API
- EP-agnostic design allows any EP to implement import
- Capability discovery with explicit ORT_NOT_IMPLEMENTED
- Follows existing patterns (Allocator, DataTransfer, SyncStream)
Includes example_plugin_ep mock implementation and autoep tests.
onnxruntime/core/session/plugin_ep/ep_factory_provider_bridge.h
- Deleted the sync_stream member from the OrtRunOptions structure.
- Removed the RunOptions_SetSyncStream API and its implementation.
- Updated related C++ API and example implementations to reflect the removal of sync stream functionality.
- Adjusted tests to remove references to RunOptions_SetSyncStream.
- Introduced new structures for external memory and semaphore handles to improve resource management.
- Ensured backward compatibility by checking EP version support for external resource import.
…l resource structs
onnxruntime/test/autoep/library/example_plugin_ep/ep_external_resource_importer.h
yuslepukhin
left a comment
@skottmckay / @yuslepukhin - any concerns with getting this in this week prior to the 1.24 snap? I think this provides a sufficient baseline for achieving the core of the asks around shared resources, and we can continue iterating on specific EP implementations with this foundation.
Looks great to me
@gaugarg-nv and @gedoensmax can you please take a look and let us know if there are any gaps/issues with this approach?
- Added `ep_interop_api.h` to define the Interop API for external resource importers.
- Implemented functions for creating and managing external resource importers, including memory and semaphore import capabilities.
- Updated `onnxruntime_c_api.cc` to integrate the new Interop API, replacing previous external resource importer implementations.
- Modified `ort_apis.h` to declare the new Interop API functions.
- Refactored tests in `test_external_resource_importer.cc` to utilize the new Interop API for external resource importer operations.
Resolved API conflicts by placing KernelInfo APIs before Interop APIs
- Return ORT_NOT_IMPLEMENTED status instead of nullptr when EP doesn't support external resource import
- Rename ep_interop_api.{cc,h} to interop_api.{cc,h} to match the generic OrtInteropApi naming
- Update documentation to reflect the new error handling behavior
…eExternalResourceImporterForDevice
Capability discovery APIs should return success with nullptr output when a feature is unsupported, rather than an error status. This allows simple "if (out != nullptr)" checks without needing to distinguish ORT_NOT_IMPLEMENTED from real errors.
- Update tests to assert status is nullptr and skip when importer is nullptr
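A minimal sketch of the capability-discovery convention described in this commit message. Every name here (`MockStatus`, `MockImporter`, `CreateImporterForDevice`, `CanUseZeroCopy`) is a hypothetical stand-in for illustration, not the ORT API or its signatures. The only thing taken from the commit message is the convention itself: success plus a null output means "unsupported", and a non-null status means a real error.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; these are NOT the real ORT types or signatures.
struct MockStatus { std::string message; };  // a null status pointer means success
struct MockImporter { int device_id; };

// Capability discovery: an unsupported feature yields success with a null
// output, while a genuine failure yields a non-null status.
inline std::unique_ptr<MockStatus> CreateImporterForDevice(
    int device_id, bool ep_supports_import, std::unique_ptr<MockImporter>& out) {
  out.reset();
  if (device_id < 0) {
    return std::make_unique<MockStatus>(MockStatus{"invalid device id"});
  }
  if (ep_supports_import) {
    out = std::make_unique<MockImporter>(MockImporter{device_id});
  }
  return nullptr;  // success in both the supported and unsupported case
}

// Caller-side pattern: check the status first, then "if (out != nullptr)".
inline bool CanUseZeroCopy(int device_id, bool ep_supports_import) {
  std::unique_ptr<MockImporter> importer;
  std::unique_ptr<MockStatus> status =
      CreateImporterForDevice(device_id, ep_supports_import, importer);
  if (status != nullptr) return false;  // real error
  return importer != nullptr;           // false: fall back to a copy path
}
```

With this shape, a test can assert that the status is null and then skip when the importer is null, which matches the test update listed above.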
onnxruntime/test/autoep/library/example_plugin_ep/ep_external_resource_importer.cc
onnxruntime/test/autoep/library/example_plugin_ep/ep_external_resource_importer.h
@skottmckay thanks for tagging, this looks good to me. One question I have is whether over-allocation could be handled, e.g. with an allocation callback. Usually one can simply bin a memory info and rely on ORT to handle the correct allocation size. With this approach, I assume the memory has to be preallocated and imported beforehand with the correct shape.
Right, good callout. The current design is intentionally kept simple and focused on the core import-existing-memory scenario for basic zero-copy interop. The dual offset pattern (`OrtExternalMemoryDescriptor::offset_bytes` + `OrtExternalTensorDescriptor::offset_bytes`) does allow importing a larger buffer and carving out regions, but I agree it's not quite the same as ORT-driven binning strategies. This is definitely something we can build on top of in the future, through allocator callbacks or other extensions.
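To illustrate the dual offset pattern mentioned in this reply, here is a rough sketch of how two offsets could compose when a tensor is carved out of a larger imported buffer. The structs and `ResolveTensorOffset` are hypothetical, simplified stand-ins; only the two `offset_bytes` field names come from the comment. The sketch assumes the memory offset is relative to the shared allocation and the tensor offset is relative to the imported region, which this thread does not confirm.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified descriptors; NOT the real ORT structs.
struct MockMemoryDescriptor {
  std::uint64_t size_bytes;    // size of the imported region
  std::uint64_t offset_bytes;  // assumed: offset of the region within the shared allocation
};
struct MockTensorDescriptor {
  std::uint64_t offset_bytes;  // assumed: offset of the tensor within the imported region
  std::uint64_t size_bytes;    // bytes occupied by the tensor data
};

// Writes the tensor's byte offset from the start of the shared allocation to
// *out and returns true, or returns false if the tensor would run past the
// end of the imported region.
inline bool ResolveTensorOffset(const MockMemoryDescriptor& mem,
                                const MockTensorDescriptor& tensor,
                                std::uint64_t* out) {
  if (tensor.offset_bytes > mem.size_bytes) return false;
  // Written as a subtraction so the bounds check itself cannot overflow.
  if (tensor.size_bytes > mem.size_bytes - tensor.offset_bytes) return false;
  *out = mem.offset_bytes + tensor.offset_bytes;
  return true;
}
```

For example, a 1024-byte region imported at offset 256 with a 512-byte tensor at offset 512 resolves to byte 768 of the shared allocation, while a 513-byte tensor at the same offset is rejected.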
skottmckay
left a comment
Description
Introduces the OrtExternalResourceImporter API enabling execution providers to import D3D12 shared resources and timeline fences for zero-copy GPU-to-GPU data sharing with ORT inference.
Public API additions:
EP Plugin API:
Design:
Includes example_plugin_ep mock implementation and autoep tests.
Motivation and Context
#26821